ABSTRACT

The General Crop Estimation Surveys (GCES) scheme is adopted in all the states of the country to estimate crop yield
at higher levels (state, district). With progress in agricultural planning, especially in the case of the cotton crop,
estimates of cotton crop yield are needed at the tehsil/block level with the desired degree of precision. Application of
GCES as such at the tehsil/block level, with the same number of crop cutting experiments (CCEs), may yield estimates with
a lower degree of precision. If the simple crop-cutting approach is to be adopted directly for this purpose, the present
number of crop cutting experiments will have to be increased significantly. In such a case, the use of information obtained
from an auxiliary variable correlated with the variable under study may increase the degree of precision of estimates at
the tehsil level. In this study, a double sampling regression approach under a stratified two stage sampling design
framework has been proposed for estimation of the average yield of cotton at the tehsil level, using the picking having
the highest correlation with total yield as the auxiliary variable.
INTRODUCTION

Cotton is an important fiber yielding crop of global importance, grown in tropical and subtropical regions of
more than 80 countries of the world. The cotton crop is harvested in a number of pickings, and harvesting
continues over a long time interval. The total number of pickings varies from state to state, from 2-3 pickings
to 10 pickings. In the irrigated crop, picking is over in 3-4 pickings and the crop is uprooted to accommodate the
second crop, whereas in rain-fed areas the pickings are staggered over a period of time, depending on the advent
of rain, when more flushes appear in the crop.

Presently the estimates of the total yield of cotton are being obtained on the basis of scientifically 
designed crop cutting experiments (CCEs) conducted under the GCES. The sampling design followed under 
GCES scheme for crop cutting experiments in the states is stratified three stage random sampling with 
mandals/taluks/revenue inspector circles/blocks/tehsils as strata, villages within the stratum as first stage units 
(fsu), fields/survey numbers in the selected village as the second stage units (ssu) and plot of specified size within 
the selected field as third stage units (tsu). Since the ultimate unit of sampling, a plot of specified size, is
located within each selected second stage unit, the sampling design adopted for crop cutting experiments may be
considered as stratified two stage random sampling.
With progress in agricultural planning, especially for the cotton crop, yield estimates are needed for small areas. The
GCES scheme is used to estimate the yield of the cotton crop at higher levels (state, district). If GCES is applied for
yield estimation at the tehsil level with the same number of crop cutting experiments, the desired degree of precision of
the estimate may not be achieved. To achieve it, a larger number of crop cutting experiments would have to be conducted in
each tehsil than under GCES, and hence more cost would be incurred per tehsil. By instead incurring a little extra
expenditure on collecting data on an auxiliary variable, it is possible to obtain reliable estimates of crop yield at the
lower/block level without disturbing the existing structure of data collection being used in the official statistical
system.
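
The trade-off described above can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that the standard error falls in proportion to 1/sqrt(n) as under simple random sampling with a negligible finite population correction; the function name and the figures used are hypothetical and are not taken from GCES.

```python
import math

def required_cces(current_n: int, current_pct_se: float, target_pct_se: float) -> int:
    """Number of CCEs needed to reach a target %S.E.

    Assumption (not from the paper): %S.E. is proportional to 1/sqrt(n).
    """
    if current_n <= 0 or current_pct_se <= 0 or target_pct_se <= 0:
        raise ValueError("all inputs must be positive")
    # Halving the %S.E. requires four times as many experiments.
    factor = (current_pct_se / target_pct_se) ** 2
    return math.ceil(current_n * factor)

# Hypothetical figures: 8 CCEs in a tehsil giving 20% S.E., target of 10% S.E.
print(required_cces(8, 20.0, 10.0))
```

Under this simplification the number of experiments grows with the square of the desired gain in precision, which is why an inexpensive auxiliary variable is attractive.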

Estimation of the unknown population mean of the variable under study (Y) can be made more efficient through the use of
an auxiliary variable (X) which is correlated with the variable under study. When the population mean of X is not known,
it is estimated from a large preliminary sample, which leads to double sampling regression type estimators. Double sampling
is a sampling method which uses auxiliary information for improving the precision of the estimator. In the double sampling
approach, a large sample of units is drawn to obtain auxiliary information, and then a second sample is drawn on which both
the auxiliary variable and the variable of interest are observed. The regression estimator utilizes the information
obtained from the auxiliary variable.
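
For orientation, the textbook form of the double sampling regression estimator can be written down for the simplest case of simple random sampling in both phases, which is not the stratified two stage design used later in this paper. The symbols are assumptions introduced here: $n'$ and $n$ are the first-phase and second-phase sample sizes, $N$ is the population size, $b$ is the sample regression coefficient of $y$ on $x$, $S_y^2$ is the population variance of $y$ and $\rho$ is the correlation between $y$ and $x$.

```latex
\[
\bar{y}_{lr} = \bar{y}_{n} + b\,(\bar{x}_{n'} - \bar{x}_{n}),
\qquad
V(\bar{y}_{lr}) \approx \frac{S_y^2\,(1-\rho^2)}{n} + \frac{\rho^2 S_y^2}{n'} - \frac{S_y^2}{N}.
\]
```

Compared with the variance $S_y^2/n - S_y^2/N$ of the simple mean of the second-phase sample, the approximate reduction is $\rho^2 S_y^2\,(1/n - 1/n')$, so the gain grows with the strength of the correlation and with the size of the first-phase sample.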

Ghosh (1948) gave linear regression estimator in double sampling using many auxiliary variables. Sukhatme and 
Panse (1951) investigated the magnitude of the bias from the data collected in district crop cutting surveys and had shown 
that bias is negligible. Utilizing the information on eye estimated yields and using the double sampling technique, ratio and 
regression estimates were obtained for the mean yield per block. Olkin (1958) was the first to consider the multi-auxiliary 
variables in building ratio estimator. Method of estimation suggested by Olkin was extended by Rao and Mudholkar (1967)
to a case where some of the auxiliary variables are positively correlated with the character under study and some are 
negatively correlated with the character under study. Sukhatme and Koshal (1959) discussed the ratio method of estimation 
in case of a single auxiliary variable with unknown population mean for multi-stage sampling design. Goswami and 
Sukhatme (1965) extended their results to several auxiliary variables with unknown mean for a three-stage sampling 
design. Panse et al. (1966) adopted the double sampling approach for estimation of block yield by treating farmer's eye 
estimate as auxiliary character and crop cut estimate as character under study. Srivastava (1971) studied the case when 
some of auxiliary variables are positively correlated and some are negatively correlated. The ICAR-IASRI initiated a pilot 
sample survey for estimating yield of cotton in Hissar (Haryana) during 1976-77 with the objective of developing sampling 
methodology for estimating the yield of cotton and suggesting a procedure for building up advance estimates of the yield of 
cotton on the basis of few pickings. Ahmad and Kathuria (2010) adopted double sampling approach to obtain the most 
reliable estimates of crop yield at the block level using farmer's eye estimate and area of the fields as auxiliary variables. 
Molina and Rao (2010) have shown some aspects of poverty estimation at the small area level. Ahmad et al.
(2013) developed an alternative sampling methodology for estimation of the average yield of cotton using double sampling 
technique under stratified two stage sampling design framework under the study entitled "Study to develop an alternative 
methodology for estimation of Cotton production of India" funded by DES, Ministry of Agriculture, Govt. of India.
Saegusa (2015) developed the variance estimation procedure using the bootstrap under two-phase stratified sampling 
without replacement. Asghar et al. (2017) proposed a multivariate regression-cum-exponential type estimator for 
estimating a vector of population variance using multi-auxiliary variables in two-phase sampling. 

In this paper, an attempt has been made to obtain reliable estimates at the tehsil level using the double sampling
regression approach under a stratified two stage sampling design framework, treating revenue circles as strata, villages
as first stage units, fields growing the cotton crop within the selected village as second stage units and a randomly
located plot of standard size in the selected field as the ultimate sampling unit, and considering the picking having the
highest correlation with the total yield of the cotton crop as the auxiliary variable.

DATA 

Estimation of the yield of cotton at the tehsil level, along with the percentage standard error (%S.E.), using the double
sampling regression approach under a stratified two stage sampling design framework has been carried out for the tehsils of
two districts of Maharashtra State, namely Amravati and Aurangabad, for the year 2012-2013. The data available in the
Division of Sample Surveys, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, under the project entitled
"Study to develop an alternative methodology for estimation of cotton production" have been used for this study.

METHODOLOGY FOR ESTIMATION OF AVERAGE YIELD OF COTTON AT TEHSIL LEVEL 

The proposed procedure for estimating the average yield of cotton at the tehsil level is the double sampling
regression approach under a stratified two stage sampling design framework, using the picking having the highest
correlation with total yield as the auxiliary variable. It was observed in the previous study carried out by Ahmad et al.
(2013) that the third picking of the cotton crop in Maharashtra State has the highest correlation with the total yield of
the cotton crop. By exploiting the correlation between the data on this picking and the available CCE data, it is possible
to obtain a precise estimate of crop production using fewer crop cutting experiments than under GCES. Therefore, the third
picking of the cotton crop has been used as the auxiliary variable to obtain estimates at the tehsil level with an
acceptable degree of precision. Thus, by incurring a little extra expenditure on collecting data on the auxiliary variable,
it is possible to obtain reliable estimates of crop yield at the lower (block/tehsil) level without disturbing the existing
structure of data collection being used in the official statistical system. The study thus aims at generating precise
estimates of crop yield at the block/tehsil level by exploiting the correlation between the total crop yield and the yield
of the picking (auxiliary variable) having the highest correlation with it.

Under the proposed approach, from each stratum a large sample of villages has been selected by Simple Random
Sampling without Replacement (SRSWOR) for observing the yield of the picking having the highest correlation with total
yield. This preliminary sample consists of the same number of villages as selected under the GCES scheme. From each
selected village, four fields growing cotton have been selected by SRSWOR for observing the yield of one picking (the
auxiliary variable, i.e. the third picking), with the help of the existing field agency of the State Government if
possible, or through full-time ad hoc staff. The proposed design thus requires CCE data from two additional fields in each
selected CCE village, besides the CCE data from two fields of the same village. Out of these preliminary sample villages, a
sub-sample of villages has been selected for observing the yield of the remaining pickings. In each of these sub-sample
villages, two of the four fields selected for one picking have been selected for observing the yield of the remaining
pickings from the CCE plot. The data on the sub-sample for the remaining pickings were collected with the help of the
existing field agency of the State Government, as the data were to be collected from a significantly smaller number of
fields than under the GCES set-up.
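
The selection steps described above can be sketched in code. This is a minimal illustration only: the sampling frame, the function name `select_sample` and all identifiers in the example are hypothetical, and the sketch assumes every village in the frame has at least four cotton fields.

```python
import random

def select_sample(frame, n_first, n_sub, m_first=4, m_sub=2, seed=None):
    """Two-phase selection by SRSWOR within each stratum.

    frame: {stratum: {village: [field ids]}} (hypothetical sampling frame)
    Returns {stratum: {village: {"aux_fields": [...], "cce_fields": [...]}}}.
    """
    rng = random.Random(seed)
    plan = {}
    for stratum, villages in frame.items():
        # First phase: preliminary sample of villages for the auxiliary picking.
        first_phase = rng.sample(sorted(villages), n_first)
        # Second phase: sub-sample of villages for the remaining pickings.
        sub_sample = set(rng.sample(first_phase, n_sub))
        plan[stratum] = {}
        for village in first_phase:
            # Fields observed for the auxiliary (e.g. third) picking.
            aux_fields = rng.sample(villages[village], m_first)
            # In sub-sample villages, a subset of those fields gets full CCE data.
            cce_fields = rng.sample(aux_fields, m_sub) if village in sub_sample else []
            plan[stratum][village] = {"aux_fields": aux_fields, "cce_fields": cce_fields}
    return plan

# Hypothetical frame: 2 revenue circles, 6 villages each, 10 cotton fields per village.
frame = {
    f"circle_{h}": {f"v{h}_{i}": [f"f{h}_{i}_{j}" for j in range(10)] for i in range(6)}
    for h in range(2)
}
plan = select_sample(frame, n_first=4, n_sub=2, seed=1)
```

Drawing the CCE fields from the fields already chosen for the auxiliary picking keeps the second-phase sample nested within the first, which is what the regression adjustment relies on.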

The estimation procedure for the average yield of cotton using the double sampling regression approach
under a stratified two stage sampling design framework is as follows:

Let

$L$ = Number of strata (revenue circles) in a tehsil

$N_h$ = Total number of first stage units (fsu's), villages, in the $h$-th stratum ($h = 1, 2, \ldots, L$)

$n_h'$ = Number of fsu's selected randomly in the $h$-th stratum for observing yield for the $p$-th picking ($p = 1, 2, \ldots, P$)

$n_h$ = Size of the sub-sample selected randomly in the $h$-th stratum for observing yield for the remaining pickings

$m'$ = Number of second stage units (ssu's), fields, selected for observing yield for the $p$-th picking in the $i$-th
($i = 1, 2, \ldots, n_h'$) village of the $h$-th stratum

$m$ = Size of the sub-sample, out of these $m'$ ssu's, for observing yield for the remaining pickings

$y_{hij(p)}$ = Yield of cotton in the $j$-th field ($j = 1, 2, \ldots, m'$) of the $i$-th village in the $h$-th stratum corresponding to the $p$-th picking

Estimator of average yield corresponding to the $p$-th picking, $\bar{y}_{nm(p)}$, for a tehsil is given by

\[
\bar{y}_{nm(p)} = \frac{1}{n_0 m} \sum_{h=1}^{L} \sum_{i=1}^{n_h} \sum_{j=1}^{m} y_{hij(p)},
\qquad \text{where } n_0 = \sum_{h=1}^{L} n_h .
\]


Estimate of average yield, $\bar{y}_{nm}$, for a tehsil is given by

\[
\bar{y}_{nm} = \sum_{p=1}^{P} \bar{y}_{nm(p)}
\]
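
The per-picking and overall averages can be computed as in the sketch below. It assumes that the per-picking estimator is the simple mean over all sub-sample fields in the tehsil and that the same number of fields is observed in every sub-sample village; the nested-list layout and the yield figures are hypothetical.

```python
def picking_mean(yields, p):
    """Mean yield of picking p over all sub-sample fields of a tehsil.

    yields[h][i][j][p]: stratum h, village i, field j, picking p (0-based).
    """
    total = 0.0
    count = 0
    for stratum in yields:
        for village in stratum:
            for field in village:
                total += field[p]
                count += 1
    return total / count

def total_mean(yields):
    """Average total yield: sum of the per-picking means over all pickings."""
    n_pickings = len(yields[0][0][0])
    return sum(picking_mean(yields, p) for p in range(n_pickings))

# Hypothetical data: 2 strata, 2 villages each, 2 fields each, 3 pickings (kg/ha).
yields = [
    [[[100, 200, 300], [110, 210, 310]], [[90, 190, 290], [100, 200, 300]]],
    [[[120, 220, 320], [80, 180, 280]], [[100, 200, 300], [100, 200, 300]]],
]
print(picking_mean(yields, 0), total_mean(yields))
```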


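A double sampling regression estimator for estimation of the average yield of cotton under the proposed framework
can be written using the double sampling procedure as

\[
\bar{y}_{ld} = \bar{y}_{nm} + b\left(\bar{y}_{n'm'(p)} - \bar{y}_{nm(p)}\right)
\]

where $\bar{y}_{n'm'(p)}$ is the average yield of the $p$-th picking based on the preliminary sample of $n_h'$ villages and
$m'$ fields per village, $\bar{y}_{nm(p)}$ is the corresponding average based on the sub-sample, and $b$ is the sample
regression coefficient of total yield on the yield of the $p$-th picking computed from the sub-sample.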
Here $N = \sum_{h} N_h$ and $\tilde{N}$ is the harmonic mean of the $N_h$'s. This is the usual estimate in a sub-sampling design.
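
The double sampling regression estimator can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: fields are pooled across strata, the regression coefficient is the ordinary least squares slope of total yield on the auxiliary picking computed from the sub-sample, and the function name and data are hypothetical.

```python
def regression_estimate(sub_total, sub_aux, first_aux):
    """Double sampling regression estimate of the mean total yield.

    sub_total: total yields of the sub-sample fields
    sub_aux:   auxiliary-picking yields of the same sub-sample fields
    first_aux: auxiliary-picking yields of all first-phase fields
    """
    n = len(sub_total)
    ybar = sum(sub_total) / n                    # sub-sample mean of total yield
    xbar = sum(sub_aux) / n                      # sub-sample mean of auxiliary picking
    xbar_first = sum(first_aux) / len(first_aux) # first-phase mean of auxiliary picking
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(sub_aux, sub_total))
    sxx = sum((x - xbar) ** 2 for x in sub_aux)
    b = sxy / sxx                                # least squares slope from the sub-sample
    # Adjust the sub-sample mean by the difference in auxiliary means.
    return ybar + b * (xbar_first - xbar)

# Hypothetical yields (kg/ha) in which total yield = 100 + 2.5 * auxiliary picking.
sub_aux = [100, 200, 300, 400]
sub_total = [350, 600, 850, 1100]
first_aux = [100, 200, 300, 400, 150, 250, 350, 450]
print(regression_estimate(sub_total, sub_aux, first_aux))
```

In this constructed example the sub-sample under-represents high-yielding fields relative to the first-phase sample, so the estimate is adjusted upwards from the sub-sample mean.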
In this investigation, we consider that within each stratum (revenue circle), some villages have been selected and
these selected villages are the selected CCE villages. In order to apply double sampling regression approach under 
stratified two stage sampling design framework under a new setup, the selected CCE villages in revenue circle have been 
treated as preliminary sample of villages (n') for third picking yield. From each selected village 4 fields growing the cotton 
crop have been selected by SRSWOR for recording third picking yield. A sub-sample of 2 villages has been selected from 
preliminary sample villages for observing the remaining pickings. Out of 4 fields (m') selected for third picking yield, in 
each of these 2 villages, 2 fields have been selected for obtaining harvested yield from a randomly located plot of standard 
size in each field. The estimates of average yield of cotton along with % S.E. have been obtained for all the tehsils of 
Amravati and Aurangabad districts of Maharashtra State for the year 2012-2013 using the estimation procedure discussed 
above with the available data in the division of Sample Surveys, ICAR-Indian Agricultural Statistics Research Institute, 
New Delhi, under the project entitled "Study to develop an alternative methodology for estimation of cotton production". 

RESULTS AND DISCUSSIONS 

The data for all the tehsils of Amravati and Aurangabad districts of Maharashtra State for the year 2012-2013
were analyzed as per the proposed estimation procedure. The results of the tehsil-wise analysis are presented in Table 1
and Table 2.


Table 1: Estimates of Average Yield of Cotton (kg/ha) at Tehsil Level along with Percentage
Standard Error for Amravati District of Maharashtra State for the Year 2012-13

S. No.   Tehsil             Cotton Yield (kg/ha)   % S.E.
1        Amravati           691.40                 9.63
2        Bhatkuli           589.11                 7.81
3        Nangaon            515.02                 5.03
4        Chandur Railway    807.25                 3.24
5        Dhamangaon         657.05                 9.07
6        Morshi             760.37                 7.18
7        Tiosa              553.75                 9.64
8        Warud              652.41                 4.51
9        Chandur Bazar      622.80                 6.30
10       Daryapur           534.91                 4.51
11       Anjangaon          641.80                 6.79
12       Achalpur           583.80                 7.28
13       Dharni             316.64                 4.96
14       Chikhaldara        323.45                 5.91


Table 2: Estimates of Average Yield of Cotton (kg/ha) at Tehsil Level along with Percentage
Standard Error for Aurangabad District of Maharashtra State for the Year 2012-13

S. No.   Tehsil             Cotton Yield (kg/ha)   % S.E.
1        Aurangabad         122.02                 4.27
2        Phulambri          207.12                 9.91
3        Paithan            172.54                 8.06
4        Gangapur           130.88                 9.36
5        Vaijapur           183.89                 9.84
6        Kannad             261.88                 10.18
7        Khuldabad          241.96                 4.47
8        Sillod             142.10                 8.01
9        Soygaon            268.58                 4.079


It can be observed from Table 1 and Table 2 that under the proposed methodology, namely the double
sampling regression approach under a stratified two stage sampling design framework using the third picking of the cotton
crop as the auxiliary variable, the estimates of the average yield of cotton at the tehsil level have been obtained with
less than 10% standard error for all the tehsils of Amravati and Aurangabad districts, except Kannad tehsil of Aurangabad
district. The %S.E. obtained for Kannad tehsil is 10.18%, which is close to 10% and is fairly reliable at the tehsil level,
since an estimate with 10% standard error is considered reliable even at the district level.
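
The reading of the two tables can be checked with a few lines of code. The sketch below copies the reported yields and percentage standard errors from Table 1 and Table 2, converts a %S.E. into an absolute standard error in kg/ha, and lists the tehsils at or above the 10% threshold; the variable and function names are illustrative.

```python
# (district, tehsil, yield in kg/ha, %S.E.) as reported in Tables 1 and 2.
estimates = [
    ("Amravati", "Amravati", 691.40, 9.63),
    ("Amravati", "Bhatkuli", 589.11, 7.81),
    ("Amravati", "Nangaon", 515.02, 5.03),
    ("Amravati", "Chandur Railway", 807.25, 3.24),
    ("Amravati", "Dhamangaon", 657.05, 9.07),
    ("Amravati", "Morshi", 760.37, 7.18),
    ("Amravati", "Tiosa", 553.75, 9.64),
    ("Amravati", "Warud", 652.41, 4.51),
    ("Amravati", "Chandur Bazar", 622.80, 6.30),
    ("Amravati", "Daryapur", 534.91, 4.51),
    ("Amravati", "Anjangaon", 641.80, 6.79),
    ("Amravati", "Achalpur", 583.80, 7.28),
    ("Amravati", "Dharni", 316.64, 4.96),
    ("Amravati", "Chikhaldara", 323.45, 5.91),
    ("Aurangabad", "Aurangabad", 122.02, 4.27),
    ("Aurangabad", "Phulambri", 207.12, 9.91),
    ("Aurangabad", "Paithan", 172.54, 8.06),
    ("Aurangabad", "Gangapur", 130.88, 9.36),
    ("Aurangabad", "Vaijapur", 183.89, 9.84),
    ("Aurangabad", "Kannad", 261.88, 10.18),
    ("Aurangabad", "Khuldabad", 241.96, 4.47),
    ("Aurangabad", "Sillod", 142.10, 8.01),
    ("Aurangabad", "Soygaon", 268.58, 4.079),
]

def absolute_se(yield_kg_ha, pct_se):
    """Standard error in kg/ha implied by an estimate and its %S.E."""
    return yield_kg_ha * pct_se / 100.0

# Tehsils whose %S.E. is at or above the 10% threshold.
above_threshold = [tehsil for _, tehsil, _, pct in estimates if pct >= 10.0]
print(above_threshold)
```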

CONCLUSIONS 

The results of the investigation demonstrate the feasibility of obtaining estimates of cotton crop yield at the tehsil
level by adopting the double sampling regression approach under a stratified two stage sampling design framework, using the
picking having the highest correlation with the total yield of the cotton crop as the auxiliary variable. The method
combines the information obtained from the third picking of the cotton crop from a large number of villages (CCE villages)
with the yield of the cotton crop obtained from a sub-sample of these CCE villages. The estimates of the average yield of
cotton at the tehsil level have been obtained with less than 10% standard error for all the tehsils of Amravati and
Aurangabad districts except Kannad (10.18%). Therefore, it is recommended that the double sampling regression procedure
under a stratified two stage sampling design framework, using the picking having the highest correlation with the total
yield of the cotton crop as the auxiliary variable, may be adopted for estimation of the average yield of cotton at the
lower level, i.e. tehsil/mandal/block/taluk, which will save the cost of the survey significantly.